PCI Express to PCI Express based low latency interconnect scheme for clustering systems

ABSTRACT

PCI Express is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard is still evolving but has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed to enable switching and interconnection between external systems, such that its scalability can be applied to data transport between connected systems to form a cluster of systems. These connected systems can be any computing or embedded system. The scalability of the interconnect will allow the cluster to grow the bandwidth between the systems as it becomes necessary, without changing to a different connection architecture.

CROSS REFERENCE TO PRIOR APPLICATION

This application is a continuation of Ser. No. 11/242,463 filed on Oct. 4, 2005 by the same inventor.

FIELD OF INVENTION

This invention relates to cluster interconnect architecture for high-speed and low latency information and data transfer between the systems in the configuration.

BACKGROUND AND PRIOR ART

The need for a high speed and low latency cluster interconnect scheme for data and information transport between systems has been recognized as one needing attention in recent times. The growth of interconnected and distributed processing schemes has made it essential that high speed interconnect schemes be defined and established to speed up the processing and data sharing between these systems.

There are interconnect schemes that allow data transfer at high speeds, the most common and fastest one existing today being the Ethernet connection, which allows transport speeds from 10 Mb/s to as high as 10 Gb/s. TCP/IP protocols used with Ethernet have high overhead with inherent latency that makes them unsuitable for some distributed applications. Effort is under way in different areas of data transport to reduce the latency of the interconnect, as this is a limitation on the growth of distributed computing power.

What is Proposed

PCI Express (PCIE) is an emerging I/O interconnect standard for use inside computers or embedded systems that allows serial high speed data transfer to and from peripheral devices. The typical PCIE link provides a 2.5 Gb/s transfer rate (this may change as the standard and data rates change). Since the PCIE standard is starting to become firm and used within systems, what is disclosed is the use of PCIE standard based peripherals, connected directly to one another using data links, as an interconnect between individual stand-alone systems, typically through an interconnect module or a network switch (switch). This interconnect scheme, by using only PCIE based protocols for data transfer over direct physical connection links between the PCIE based peripheral devices (see FIG. 1), without any intermediate conversion of the transmitted data stream to other data transmission protocols or encapsulation of the transmitted data stream within other data transmission protocols, reduces the latencies of communication in a cluster. The PCIE standard based peripheral at a peripheral endpoint of the system, by connecting directly over PCIE protocol based data links to the PCIE standard based peripheral at the switch, allows the number of links per connection to increase as bandwidth needs increase, and thereby allows scaling of the bandwidth available within any single interconnect or the system of interconnects. This will allow the interconnect architecture to remain constant as the interconnect bandwidth need goes from 2.5 Gb/s with an X1 link (single data link) to much higher values of 10 Gb/s with an X4 link (4 data links), 40 Gb/s with an X16 link (16 data links), or 80 Gb/s with an X32 link (32 data links), and so on, providing for easy scaling of the multi-system cluster.
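The scaling described above is simple arithmetic: the nominal rate of a connection is the per-link rate multiplied by the number of links. A minimal sketch follows, assuming the 2.5 per-link figure quoted in the text and ignoring encoding and protocol overhead. The names `PER_LINK_RATE` and `connection_bandwidth` are illustrative and do not come from the PCIE standard.

```python
# Illustrative only: nominal per-link rate taken from the text above, in the
# same units the text uses. Encoding and protocol overhead are ignored.
PER_LINK_RATE = 2.5


def connection_bandwidth(num_links: int, per_link_rate: float = PER_LINK_RATE) -> float:
    """Return the nominal aggregate rate of a connection with num_links data links."""
    if num_links < 1:
        raise ValueError("a connection needs at least one link")
    return num_links * per_link_rate


if __name__ == "__main__":
    # Link widths named in the text: X1, X4, X16, X32.
    for width in (1, 4, 16, 32):
        print(f"X{width}: {connection_bandwidth(width)}")
```

Because only `num_links` changes between rows, the sketch mirrors the point of the text: bandwidth grows with link count while the connection model stays the same.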

Some Advantages of the Proposed Connection Scheme:

1. Reduced latency of data transfer, as conversion from PCIE to other protocols like Ethernet is avoided during transfer.
2. The number of links per connection can scale from X1 to larger numbers, X32 or even X64, based on the bandwidth needed.
3. Minimum change in interconnect architecture is needed with increased bandwidth, enabling easy scaling with need.
4. Standardization of the PCIE based peripheral will make components easily available from multiple vendors, making the implementation of the interconnect scheme easier and cheaper.
5. The PCIE based peripheral to PCIE based peripheral links in connections allow ease of software control and provide reliable bandwidth.

DESCRIPTION OF FIGURES

FIG. 1: Typical interconnected (multi-system) cluster (shown with eight systems connected in a star architecture using direct connected data links between PCIE standard based peripheral to PCIE standard based peripheral)

FIG. 2: A cluster using multiple interconnect modules or switches to interconnect smaller clusters.

EXPLANATION OF NUMBERING AND LETTERING IN THE FIG. 1

- (1) to (8): Number of systems interconnected in FIG. 1
- (9): Network switch (switch) sub-system.
- (10): Software configuration and control input for the switch.
- (1 a) to (8 a): PCI Express based peripheral modules (PCIE modules) attached to systems.
- (1 b) to (8 b): PCI Express based peripheral modules (PCIE modules) at switch.
- (1L) to (8L): PCIE based peripheral module to PCIE based peripheral module connections having n-links (n-data links).

EXPLANATION OF NUMBERING AND LETTERING IN THE FIG. 2

- (12-1) and (12-2): Clusters
- (9-1) and (9-2): Interconnect modules or switch sub-systems.
- (10-1) and (10-2): Software configuration inputs
- (11-1) and (11-2): Switch to switch interconnect module in the cluster
- (11L): Switch to switch interconnection

DESCRIPTION OF THE INVENTION

PCI Express is a bus or I/O interconnect standard for use inside a computer or embedded system, enabling faster data transfers to and from peripheral devices. The standard is still evolving but has achieved a degree of stability such that other applications can be implemented using PCIE as a basis. A PCIE based interconnect scheme is proposed to enable switching and interconnection between external systems, such that its scalability can be applied to data transport between connected systems to form a cluster of systems. These connected systems can be any computing or embedded system. The scalability of the interconnect will allow the cluster to grow the bandwidth between the systems as it becomes necessary, without changing to a different connection architecture.

FIG. 1 is a typical cluster interconnect. The multi-system cluster shown consists of eight units or systems {(1) to (8)} that are to be interconnected. Each system has a PCI Express (PCIE) based peripheral module {(1 a) to (8 a)} as an IO module, at the interconnect port, with n-links built into or attached to the system. (9) is an interconnect module or a switch sub-system, which has a number of PCIE based interconnect modules equal to or greater than the number of systems to be interconnected, in the case of FIG. 1 this number being eight {(1 b) to (8 b)}, that can be interconnected for data transfer through the switch. A software based control input (10) is provided to configure and/or control the operation of the switch. Link connections {(1L) to (8L)} attach the PCIE based peripheral modules on the respective systems to those on the switch with n links. The value of n can vary depending on the connection bandwidth required by the system.
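The star arrangement of FIG. 1 can be written down as a small data model, which makes the labeling convention explicit. The following is a sketch only: the class `Connection`, the function `build_star_cluster`, and the default of four links per connection are assumptions made for illustration, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """One labeled connection of FIG. 1, e.g. (1L) joining (1 a) and (1 b)."""
    label: str          # e.g. "1L"
    system_module: str  # e.g. "1a", the PCIE module on the system side
    switch_module: str  # e.g. "1b", the PCIE module on the switch side
    num_links: int      # n, the number of data links in this connection


def build_star_cluster(num_systems: int = 8, num_links: int = 4) -> dict[int, Connection]:
    """Map each system number to its connection to the switch (9).

    num_links=4 is an arbitrary illustrative default; the text leaves n open.
    """
    return {
        i: Connection(f"{i}L", f"{i}a", f"{i}b", num_links)
        for i in range(1, num_systems + 1)
    }


if __name__ == "__main__":
    for number, conn in build_star_cluster().items():
        print(number, conn)
```

Each system appears exactly once, and the switch needs at least as many port modules as there are entries in the mapping, matching the "equal to or more than" condition in the text.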

When data has to be transferred between, say, system 1 and system 5, in the simple case, the control is used to establish an internal link between PCIE based peripheral modules 1 b and 5 b inside the switch. The handshake is established between outbound PCIE based peripheral module (PCIE module) 1 a and inbound PCIE module 1 b, and between outbound PCIE module 5 a and inbound PCIE module 5 b. This provides a through connection between the PCIE modules 1 a to 5 b through the switch, allowing data transfer. Data can then be transferred at speed between the modules and hence between systems. In more complex cases, data can also be transferred and queued in storage implemented in the switch and then, when links are free, transferred out to the right systems at speed.
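Both cases described above, the direct internal link and the queued transfer, can be illustrated with a toy model of the switch. This is a sketch under stated assumptions: the class `Switch` and its methods are hypothetical, each port is assumed to take part in at most one internal link at a time, and the handshake between the system-side and switch-side PCIE modules is not modeled.

```python
from collections import deque


class Switch:
    """Toy model of the switch (9): internal links between port modules plus a queue."""

    def __init__(self, ports):
        self.ports = set(ports)      # e.g. {"1b", ..., "8b"}
        self.internal_links = {}     # port -> peer port currently linked to it
        self.queued = deque()        # (src_port, dst_port, data) waiting for a free link
        self.delivered = []          # (dst_port, data) in delivery order

    def connect(self, a, b):
        """Link two free ports internally; return False if either is busy."""
        if a not in self.ports or b not in self.ports or a == b:
            raise ValueError("unknown or identical ports")
        if a in self.internal_links or b in self.internal_links:
            return False
        self.internal_links[a] = b
        self.internal_links[b] = a
        return True

    def disconnect(self, a):
        """Tear down the internal link that involves port a, if any."""
        b = self.internal_links.pop(a, None)
        if b is not None:
            self.internal_links.pop(b, None)

    def send(self, src, dst, data):
        """Deliver over an existing or newly made link; otherwise queue the data."""
        if self.internal_links.get(src) == dst or self.connect(src, dst):
            self.delivered.append((dst, data))
            return True
        self.queued.append((src, dst, data))
        return False

    def drain(self):
        """Retry each queued transfer once; blocked transfers are queued again."""
        for _ in range(len(self.queued)):
            src, dst, data = self.queued.popleft()
            self.send(src, dst, data)
```

For example, with ports `1b` to `8b`, `send("1b", "5b", ...)` succeeds at once, a second sender targeting `5b` is queued, and after `disconnect("1b")` a call to `drain()` delivers the queued data.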

Multiple systems can be interconnected at one time to form a multi-system that allows data and information transfer and sharing through the switch. It is also possible to connect smaller clusters together to take advantage of the growth in system volume by using an available connection scheme that interconnects the switches that form a node of the cluster.

If the need for higher bandwidth and low latency data transfers between systems increases, the connections can grow by increasing the number of links connecting the PCIE modules between the systems in the cluster and the switch, without completely changing the architecture of the interconnect. This scalability is of great importance in retaining flexibility for growth and scaling of the cluster.
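Growing a connection in this way amounts to choosing the smallest link count whose nominal rate covers the new requirement. A minimal sketch, assuming the 2.5 per-link figure and the link widths named earlier in the text (X1, X4, X16, X32), with overhead ignored; `WIDTHS` and `smallest_width` are hypothetical names.

```python
PER_LINK_RATE = 2.5      # nominal per-link figure from the text, overhead ignored
WIDTHS = (1, 4, 16, 32)  # link widths named in the text; other widths may exist


def smallest_width(required_rate: float) -> int:
    """Return the smallest listed link width whose nominal rate meets required_rate."""
    for width in WIDTHS:
        if width * PER_LINK_RATE >= required_rate:
            return width
    raise ValueError("requirement exceeds the widest listed connection")


if __name__ == "__main__":
    for need in (2.0, 10.0, 12.0, 80.0):
        print(need, "->", f"X{smallest_width(need)}")
```

Only the width of an existing connection changes as the requirement grows; the systems, the switch, and the way they are connected are untouched.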

It should be understood that the system may consist of peripheral devices, storage devices, processors, and any other communication devices. The interconnect is agnostic to the type of device as long as it has a PCIE module at the port to enable the connection to the switch. This feature will reduce the cost of expanding the system, since only the switch interconnect density needs to change for growth of the multi-system.

PCIE is currently being standardized, and that will enable existing PCIE modules from different vendors to be used, reducing the overall cost of the system. In addition, using a standardized module in the system as well as the switch will allow the cost of software development to be reduced and, in the long run, allow available software to be used to configure and run the systems.

As the expansion of the cluster in terms of number of systems connected, bandwidth usage, and control will all be cost effective, it is expected that the overall system cost can be reduced and overall performance improved by standardized PCIE module use with standardized software control.

Typical connect operation may be explained with reference to two of the systems, for example system (1) and system (5). System (1) has a PCIE module (1 a) at the interconnect port, and that is connected by the connection link or data-link or link (1L) to a PCIE module (1 b) at the IO port of the switch (9). System (5) is similarly connected to the switch through the PCIE module (5 a) at its interconnect port to the PCIE module (5 b) at the switch (9) IO port by link (5L). Each PCIE module operates for transfer of data to and from it by standard PCI Express protocols, provided by the configuration software loaded into the PCIE modules and switch. The switch operates by the software control and configuration loaded in through the software configuration input.

FIG. 2 is that of a multi-switch cluster. As the need to interconnect a larger number of systems increases, it will be optimum to interconnect multiple switches of the clusters to form a new, larger cluster. Such a connection is shown in FIG. 2. The connection shown is for two smaller clusters (12-1 and 12-2) interconnected using PCIE modules that can be connected together using any low latency switch to switch connection (11-1 and 11-2), connected using interconnect links (11L) to provide sufficient bandwidth for the connection. The switch to switch connection transmits and receives data and information using any suitable protocol, and the switches provide the interconnection internally through the software configuration loaded into them.
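The path that data takes in the two-switch arrangement of FIG. 2 can be sketched as a list of hops. The reference numerals (9-1, 9-2, 11-1, 11-2, 11L) come from the figure legend; the system names, the `home_switch` mapping, and the function `route` are hypothetical and serve only to illustrate the idea.

```python
# Reference numerals from FIG. 2: each switch and its switch to switch module.
SWITCH_UPLINK = {"9-1": "11-1", "9-2": "11-2"}
INTER_SWITCH_LINK = "11L"


def route(src: str, dst: str, home_switch: dict[str, str]) -> list[str]:
    """Return the ordered hops from system src to system dst.

    home_switch maps each (hypothetically named) system to the switch it attaches to.
    """
    src_switch, dst_switch = home_switch[src], home_switch[dst]
    if src_switch == dst_switch:
        # Both systems sit in the same smaller cluster: one switch suffices.
        return [src, src_switch, dst]
    # Different clusters: cross the switch to switch interconnection.
    return [
        src, src_switch, SWITCH_UPLINK[src_switch],
        INTER_SWITCH_LINK,
        SWITCH_UPLINK[dst_switch], dst_switch, dst,
    ]


if __name__ == "__main__":
    home = {"sysA": "9-1", "sysB": "9-1", "sysC": "9-2"}
    print(route("sysA", "sysB", home))
    print(route("sysA", "sysC", home))
```

Traffic within one smaller cluster never touches the switch to switch connection, so joining clusters does not alter how each individual cluster operates internally.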

The Following are Some of the Advantages of the Disclosed Interconnect Scheme

1. Provides a low latency interconnect for the cluster.
2. Use of PCI Express based protocols for data and information transfer within the cluster.
3. Ease of growth in bandwidth as the system requirements increase, by increasing the number of links within the cluster.
4. Standardized PCIE component use in the cluster reduces initial cost.
5. Lower cost of growth due to standardization of hardware and software.
6. Path of expansion from a small cluster to larger clusters as need grows.
7. Future proofed system architecture.

In fact, the disclosed interconnect scheme provides advantages for low latency multi-system cluster growth that are not available from any other source.

1. An interconnected cluster, the cluster comprising: a PCI Express enabled interconnect module comprising a plurality of ports, each said port having a PCI Express based peripheral module enabled for interconnection; a switching mechanism in said interconnect module enabled to transfer data between a first of said plurality of ports on said interconnect module and any of a rest of said plurality of ports on said interconnect module; a plurality of devices and systems that are PCI Express enabled; each of said plurality of devices and systems comprising at least a system interconnect port comprising a PCI Express peripheral module enabled for interconnection; said at least a system interconnect port of each of said plurality of devices and systems connecting to one of said plurality of ports of said interconnect module using at least a PCI Express link; a data transfer mechanism to transfer data to and from each of said plurality of devices and systems to the one of said plurality of ports of said interconnect module connected to it, wherein said data transfer is done using a PCI Express protocol; wherein said data transfer mechanism and said switching mechanism together allow data received from a first of said plurality of devices and systems to be sent to any of a rest of said plurality of devices and systems; and wherein said data transfer mechanism and said switching mechanism together further allow data received from any of said rest of said plurality of devices and systems to be sent to said first of said plurality of devices and systems; thereby enabling the cluster with interconnection and communication capability between interconnected devices and systems using only PCI Express protocol over PCI Express links.
2. The interconnected cluster in claim 1, wherein each of said plurality of devices and systems is connected to said interconnect module by one or more PCI Express data links connected between said PCI Express peripheral module at said system interconnect port and said PCI Express peripheral module at said port of said interconnect module.
3. The interconnected cluster in claim 1, wherein the connection between the PCI Express peripheral module at the interconnect port of each of the devices and systems and the connected PCI Express peripheral module at the port of the interconnect module can use multiple data links as the bandwidth requirements of the interconnect demand.
4. The interconnected cluster in claim 1, wherein the interconnect module is a network switch.
5. The interconnected cluster in claim 1, wherein the switching between the ports inside the interconnect module is controlled by the configuration software loaded into the interconnect module.
6. An interconnected cluster system comprising two or more smaller clusters to form a larger cluster, each of said smaller clusters having multiple devices and systems and an interconnect module; said interconnected cluster system having suitable low latency interconnection between said interconnect modules of the smaller clusters to allow the overall system growth to the interconnected cluster system.
7. The interconnected cluster system of claim 6, comprising two or more smaller clusters each having multiple devices and systems and the interconnect module, wherein the interconnection between interconnect modules is scalable.
8. The interconnected cluster system of claim 6, comprising two or more clusters each having multiple devices and systems and an interconnect module, wherein the cluster growth can take place without changing the architecture of the individual small clusters.